Actor-Critic Methods
01. Introduction
02. Motivation
03. Bias and Variance
04. Two Ways for Estimating Expected Returns
05. Baselines and Critics
06. Policy-based, Value-Based, and Actor-Critic
07. A Basic Actor-Critic Agent
08. A3C: Asynchronous Advantage Actor-Critic, N-step
09. A3C: Asynchronous Advantage Actor-Critic, Parallel Training
10. A3C: Asynchronous Advantage Actor-Critic, Off- vs On-policy
11. A2C: Advantage Actor-Critic
12. A2C Code Walk-through
13. GAE: Generalized Advantage Estimation
14. DDPG: Deep Deterministic Policy Gradient, Continuous Actions
15. DDPG: Deep Deterministic Policy Gradient, Soft Updates
16. DDPG Code Walk-through
17. Summary
04. Two Ways for Estimating Expected Returns